Executive Summary: Genetic Evidence on South Asian Migrations


Academic scholarship in many disciplines — linguistics, anthropology, paleontology, 
history, and the history of religions — offers significant evidence for the migration of 
groups conventionally known as Indo-Europeans into South Asia (a process at times 
referred to as “Aryan migration”). The extent of this migration and its impact on the 
cultural and religious heritage of the region is contested today, as scholars develop new 
tools for the analysis of the ancient past and as varied religious and political movements 
attempt to achieve legitimacy for their own distinct interpretations of this past. 

In the study of ancient South Asia, as in other regions of the world, genetic studies have
recently received much attention. Over the last two decades, genetic
technology has made it possible to trace certain aspects of genetic lineages, and such 
studies focus on DNA sequences spanning autosomal, mitochondrial, and Y 
chromosomal genomic regions. 

However, for a variety of reasons, genetic studies have provided results that are
contradictory or difficult to interpret. At present there is no consensus in the field, and the
studies do not provide conclusive evidence on the genetic origins and demography of
Indian populations, or on migrations either into or out of South Asia.

Therefore we recommend that the Board of Education and its Committees continue to 
exercise due caution and avoid overturning accepted historical, linguistic, and 
archeological evidence in favor of contested population genetics claims. 

Several methodological difficulties mark this literature: 

■ The South Asian subcontinent is extremely heterogeneous. According to the 
Anthropological Survey of India, there are more than 4,600 distinct communities in the 
country. It should thus come as no surprise that sampling such a large and diverse set of
populations may not yield quick or easy answers. 

■ Sample sizes are often quite small. In sampling a diverse region like South Asia, 
individual castes or tribes are often represented by extremely small samples consisting of 
as few as one, two, or four individuals. Many of the samples are also “data-banked” and
there is uneven information about how samples were collected, the kinds of questions 
posed to donors, etcetera. Therefore more data is necessary before broad claims can be 
persuasively mounted. 

■ Studies use different genomic markers that may result in different conclusions on origins 
and demography. Some focus on certain haplotypes in the mtDNA and/or Y- 
chromosomes and/or autosomal sites. There is no consensus yet on which sites produce 
the best estimates of genetic genealogies. 

■ Studies use distinct methodologies. Across studies, different statistical measures and 
computational models are applied to analyze data. While there is some agreement on 
which methods afford the most reliable results, many studies use methods that are not
statistically consistent or have not been shown to perform well under simulation. 


■ Studies vary in the patterns they find (even when studying the same source of DNA). 

■ Data is always interpreted and there is disagreement on the interpretation of this data. Patterns 
may result because of demographic shifts or because of population bottlenecks, genetic drift, 
differing selection pressures, multiple migrations, etcetera. Studying extant DNA alone is not 
always conclusive as to why certain patterns emerge. 

■ These technologies are new and analyses are based on several assumptions. It is probable
that as the field matures, some basic assumptions will change. These include assumptions
about the mutation rates of mtDNA and Y chromosomes (recent studies find that
mutation rates and evolutionary pressures in different regions can vary, troubling
assumptions about the “average mitochondrial clock” used in studies), 32 about whether
mtDNA recombines (most studies assume that it does not, but some studies suggest
otherwise), and about whether there is ever paternal mtDNA transmission (again,
all studies assume there is not, although cases are known in which it occurs).

■ Y chromosome and mtDNA are unique in what they tell us. Their strength, in telling an 
uninterrupted story about paternal genealogy (father-father-father-father in the case of Y
chromosomes) or maternal genealogy (mother-mother-mother in the case of mtDNA), is 
also a weakness. In each generation, we sample a smaller fraction of the ancestral 
lineage. While one samples half in Generation 1, one samples only 1/16,384 in 
Generation 14. The remaining 16,383 are absent in the data. Therefore while powerful in 
what they can tell us, they also do not tell us a lot about history. Y-chromosomes 
represent less than one percent of our genome. Complex or accidental acts of history may 
forever shift the genetic data. These are partial and selective histories. 

■ While these studies use scientific methods, there are many social and cultural 
assumptions embedded in them, including outdated understandings of the modern caste 
system that equate it with the Varna system of ancient India, that assume strong 
endogamy, or that view “castes” as stable and easily identifiable. However, social 
scientists have demonstrated convincingly that caste in India is an “elastic” system and 
that two castes that share the same name may have very different origins in different 
geographical regions. 13 Studies of caste in one region are not easily extrapolated to 
others. What counts as an upper or middle or lower “caste” group in one area may not 
translate into the same category in another. Many genetic studies of South Asia depend 
on these and similar erroneous assumptions in the constitution and analysis of data. 

■ Language families (Indo-European, Dravidian, Austro-Asiatic) are often used to identify 
population groups, which tends to reinforce the notion that language families have a 
genetic basis. Yet there is no necessary relationship between language and genes, 
especially in a region like South Asia with a high degree of multilingualism. 

In conclusion, genetic evidence opens a new avenue of understanding genealogies and migration 
patterns in South Asia. However, at the present time, there is considerable variation in the results 
of such studies. At present there is no consensus in the field, and studies do not provide
conclusive evidence on the genetic origins and demography of Indian populations, or on
migrations either into or out of South Asia.

TITLE VI CENTERS REPORT ON SOURCES OF
EVIDENCE REGARDING “ARYAN MIGRATION” 

Call for scholarly assessment 

We would like to note, at the outset, how difficult it is to discuss this issue without 
raising heated debate. It is clear that although the Hindutva (Hindu nationalist) movement 
has particular political stakes in arguing for an “indigenous” origin of the so-called 
“Aryan” peoples, our task as scholars is simply to assess the facts. In our view, the 
“Aryan Migration” theory seems to be the most sound given the current state of available 
evidence. This situation may change as the methodologies and arguments of recent 
genetics studies are retested and verified. Our remarks on this subject are thus intended
to clarify, as far as possible, the current state of archeological, linguistic, and genetic
evidence concerning the “Aryan Migration Theory” at this moment.


1. Archeological and Linguistic Evidence 

In debates about the prehistory of South Asia, a major bone of contention is the so-called 
“Aryan Invasion Theory”, a 19th-century belief that the linguistic similarities between
Sanskrit and European languages such as Latin, Greek, or English must be explained in 
terms of an invasion of “Aryan” tribes into India from the west, followed by the 
subjugation of indigenous populations. In its crude “invasionist” form the theory was 
heartily embraced by the British and other Europeans as a parallel to the “civilizing” 
British conquest of India. 

Many of the arguments advanced in favor of the view have been called into question by 
recent advances in scholarship, as well as a move away from Eurocentric and racist ideas 
of civilization as solely originating in the West. This earlier “invasionist” scholarship 
included premature conclusions about “racial” differences between the Aryans and their 
opponents, as well as a historically inappropriate imposition of 19th-century notions of
nationalism on a period some three to four millennia earlier. The earlier scholarship
additionally demonstrated little clarity about the relationship between ethnicity (“Aryan”) 
and language (“Indo-Aryan”). 

Current scholarship presents a more cautious and, to our understanding, more realistic
interpretation of the evidence. Rather than an invasion by a western, European people, 
there may have been waves of movement by Indo-Aryan speaking pastoralist tribes from 
nearby Central Asia. The assumption that these speakers were “racially different” from 
the indigenous population cannot be sustained on the basis of existing textual evidence. 
Moreover, the widespread acceptance of their language in northern India is no longer 
considered a sign of massive and forceful subjugation. The most plausible current model 
is one of “elite dominance”, where the language of a relatively small number of speakers
with cultural or political prestige is adopted by a much larger indigenous population. (A 
parallel would be that of Turkish in Anatolia. Although the Turks came from Central 
Asia, the vast majority of the Turkish-speaking population of modern Turkey does not
look Central Asian, but is substantially eastern Mediterranean in character.) 

Two major arguments for an external origin of the Indo-Aryan language and its earliest 
speakers remain. One is the undeniable linguistic relationship to the other “Indo-European”
languages of Europe and Asia (Latin, Slavic, Greek, Germanic, Persian, and
many others) and the fact that a migration into India is a simpler hypothesis than the 
alternative assumption that all the languages other than Indo-Aryan migrated out. 

The other argument concerns the importance of what is known as the “horse-culture 
complex” associated with Indo-European technology: two-wheeled battle chariots drawn 
by domesticated horses, as well as the strong religious significance of the horse. No 
credible evidence for this cultural complex exists in South Asia before the 16th century
BC; rather the iconography of the earlier Harappan Civilization is dominated by unicorn 
images. The horse-culture complex originates in areas near the Ural Mountains around 
2000 BC, and can be seen as spreading from that area to the east, south, and west. The 
fact that in the Near East the arrival of horse-drawn two-wheeled battle chariots coincides 
with the arrival of Indo-European names, combined with the central cultural and religious 
significance of the horse-culture complex in early Indo-European traditions (both in
South Asia and farther west), strongly suggests that the cultural complex was spread from
the Ural area by speakers of Indo-European languages. 

The only major counter-argument against this perspective draws on the fact that some 
archaeologists find no evidence for a change of skeletal mix during the entire second 
millennium BC in the northwestern area that would have been settled first by the 
“Aryans.” However, the same archaeologists find no evidence for a change of skeletal 
mix over the entire span of the last 5000 years, in spite of the known later in-migrations 
by groups such as the Sakas, Hunas, and various other Central and West Asian groups. 
Evidently, the spread of languages and cultures does not require the movement of large 
populations (compare again the case of Turkey). This conclusion is compatible with some 
recent genetic studies. 

2. Genetic Evidence with Respect to South Asian Migrations 

Migrations of humans have been a focus of several academic disciplines and fields. 
Linguistic and anthropological evidence strongly suggests the migrations of Indo- 
Europeans (also referred to as “Aryan migration”) who brought their culture and 
language into current day South Asia. Over the last fifteen years, genetic technology has 
made it possible to trace certain aspects of genetic lineages and the field of genetics has 
now entered the landscape of data available to aid our understanding of migration 
patterns in South Asia. 

With respect to South Asia, the body of genetic evidence is rather large, including individual
population/region focused studies as well as large studies spanning populations across 
India and sometimes including comparisons to populations across the world. The 
conclusions of the various studies are not uniform and there are many contradictory
findings as well. In addition, the methodologies underlying these studies are often
untested and speculative.

Most studies sample extant South Asian populations (the current population of India is 
well over a billion at this time) in order to ascertain their relationships to each other and 
trace their historical genealogies. As there are more than 4,600 distinct communities in 
India alone, it should come as no surprise that sampling such a large and diverse set of 
populations may not yield quick or easy answers. 

Before summarizing the various studies, it is important to keep in mind their goals and 
scope. For our purposes, there are two relevant foci in the genetic studies of South Asia: 

1. Indo-European (IE) migration/Aryan migration into India: Who are the
people of India? What do we know of their origin? Given the linguistic and
anthropological evidence on Aryan migration, geneticists have now taken up this 
question. Questions include whether such a migration even took place; if it did, 
whether the contribution is small or great; the timing of such a migration; and 
finally whether the congruence can be explained by migration out of South Asia 
to other parts rather than the converse. 

Several studies find evidence to support the Aryan migration hypothesis. Most studies 
include one or more of the following sources of DNA: Y chromosomal, mtDNA and 
autosomal DNA; they sample particular geographic regions, castes or tribes as indicated. 
These are compared to each other, to other populations of India, and to other world
populations. Below are some brief summaries of articles:

Bhattacharya et al (1999) 4 focused on 125 individuals from ten ethnic groups largely in eastern 
and northern regions of India. They used 2 biallelic and 6 microsatellite markers on the Y 
chromosome. They found significant haplotype variation between castes and tribes but non- 
significant variation among caste groups. 

Bamshad et al (2001) 1 compared 265 males from eight castes within Andhra Pradesh to 400 
Indians and 350 Africans, Asians, and Europeans. They studied mtDNA (400 bp of hypervariable 
region 1 and 14 RSPs) and Y chromosomes (20 biallelic polymorphisms and 5 STRs). The results 
show support for groups of males with European affinities migrating to the region. Paternally 
inherited Y chromosome variation in each caste is more similar to Europeans than to Asians. 
Moreover, the affinity to Europeans is proportionate to caste rank, upper castes being more similar 
to Europeans, particularly East Europeans. In addition, because of the differences between 
maternal and paternal inheritance patterns, they argue that these results suggest the possibility of 
upward mobility in the caste system for women but not for men. 

Roychoudhury et al (2001) 23 focused particularly on tribal groups using autosomal DNA markers,
mtDNA RSPs and mtDNA hypervariable HVS-1. They find that among autosomal DNA, there is
correspondence between genomic and linguistic affinities. They particularly contradict Kivisild et
al (1999): despite focusing exclusively on tribal populations, the study found the relevant
haplogroup U markers entirely absent from its samples. The authors therefore disagree with
Kivisild that “all western Eurasian subclusters of haplogroup U were present in India before Aryan
speakers.”

Wells et al (2001) 31 focused on 1935 individuals using 23 biallelic polymorphism haplotypes of 
the Y chromosome with a particular focus on Central Asia. Their data suggest that Central Asia is 
an important reservoir of genetic diversity, and the source of at least 3 waves of migration leading
to Europe, the Americas and India. 

Basu et al (2003) 3 studied 58 genetic markers among 44 geographically, linguistically, and 
socially disparate ethnic populations using 10 mtDNA RSPs, 2 markers (one hypervariable 
segment 1, and one ins/del polymorphism) on the Y chromosome as well as 5 markers on 
autosomal DNA. Their data suggest a unity of female lineages in India, indicating that the number
of female settlers may have been small. They find tribal and caste populations to be highly
differentiated, find evidence of Indo-European nomad arrival, and suggest that Dravidian tribes
were possibly widespread before the arrival of the Indo-Europeans. Their results support
Bamshad’s (2001) results that Central Asians are genetically closer to upper caste populations than
to middle or lower castes.

Quintana-Murci et al (2004) 21 used 910 individuals and focused on mtDNA variation in Central 
Asia. They conclude: 

Our analysis of mtDNA from the southwestern and central Asian corridor shows that the 
highest variation is observed in populations located in the Indus Valley and Central Asia, 
highlighting this region as the place where western Eurasian lineages meet both the South 
Asian and eastern Eurasian genetic strata, respectively. The amalgamation of different 
components in this area may have resulted from successive and continuous waves of 
migration from diverse geographical sources at different time periods, from the early 
human settlements in the region after the “out of Africa” dispersal, to migrations
associated with the diffusion of new technologies, such as farming and/or pastoral
nomadism, and accompanied by new languages, like the incursion of Indo-Iranian
speakers from the northwest.

Cordaux et al (2004) 1 studied 931 individuals focusing on the Y chromosome (15 tribal and 12 
caste groups, including 155 tribal individuals only from Southern India - 9 groups and 1 caste) as 
well as markers on the mtDNA and also compared these to some published data. They found that 
caste and tribal groups differ significantly in haplogroup distribution. Caste groups are 
homogeneous for Y chromosome variation and more closely related to each other and to the 
Central Asian pool than they are to Indian tribal groups (even though they find evidence for some 
bidirectional flow between caste and tribal populations). Their paternal group data supports IE 
migrations into the caste system. Caste and tribal populations were found to be homogeneous for 
mtDNA variation. 

Rajkumar and Kashyap (2004) 22 focused on 15 autosomal microsatellite (STR) markers in 267 individuals
from 4 caste populations in Karnataka. This study focuses on a small population in Southern India
and finds similar results: high-ranking caste communities show greater affinities to East Asian
and European ethnic communities.

Several other studies find no evidence of an Aryan or IE migration: 

Kivisild et al (1999) 12 sampled 550 individuals focusing on mtDNA. They find an extensive, deep late
Pleistocene genetic link between contemporary Europeans and Indians, provided by mtDNA
haplogroup U, which encompasses roughly 1/5 of mtDNA in both populations. (Significantly, the U
haplogroup was considered to be western Eurasian specific, but this study suggests it is the 2nd most
common haplogroup in India.) All samples are derived from African mtDNA L3, supporting an out
of Africa model of human origins. They find that only a small fraction of Caucasoid specific
mtDNA lineages can be ascribed to recent admixture. Basu et al (2003) argue that their results
contradict these findings.

Kivisild (2003) 11 studied mtDNA, Y chromosome markers and 1 autosomal locus in 180
Chenchus and Koyas of Andhra Pradesh and compared them with 6 caste groups from all over India
and with populations of western and central Asia. They conclude that the (India-specific) M and N
haplogroup lineages coalesce in the early to late Pleistocene in South Asia, and that there was not a
total replacement of earlier populations by later populations. Y chromosome haplogroups rarely
found outside the subcontinent (H, L, R2) were found in caste and tribal groups. Haplogroup R1a
(putative Indo-Aryan) has its highest frequencies in Punjab and among the Chenchus, which they take
to suggest that R1a originated in South or Western Asia. They conclude:

“Substantial heterogeneity observed in the haplogroup frequencies of the tribes and their
generally lower haplotype and haplogroup diversity suggests that conclusions about
Indian prehistory cannot be based on the examination of one or a few groups. Although,
on a general scale, we can argue for largely the same prehistorical genetic inheritance in
Indian tribal and caste populations, this does not refute the existence of genetic footprints
laid down by known historical events. ... It will take larger sample sizes, more
populations, and increased molecular resolution to determine the likely modest impact of
historical gene flows to India on its pre-existing large populations.”

Palanichamy et al (2004) focused on mtDNA and on the first hypervariable region (HVS-1).
They sequenced and compared the sequences of 75 individuals. They argue that the Indian samples
contain as many of the deepest-branching lineages as the western Eurasian mtDNA pool, supporting
the idea of migrations earlier than the IE hypothesis posits.

Sahoo et al (2006) 25 studied 936 individuals for Y chromosome variation (32 tribes, 45 castes,
38 SNPs). Their results support a deep Pleistocene maternal ancestry for Indian caste and tribal
populations. They find no major influx from the west or north. Recent ancestral contributions to
Dravidian and Hindi speaking caste groups appear low. They suggest an initial settlement of
40,000 to 70,000 years ago along a southern route out of Africa. Unlike Cordaux (2004) they find that
caste populations of North and South are not significantly more closely related to each other than 
to other tribal populations. They find no differences between castes and tribes (contradicting 
several of the above studies). They do find evidence for large scale immigration and language 
change in the northeast. 

Sengupta et al (2006) 27 focus on 69 Y chromosome binary markers and 10 microsatellite markers
in 728 Indian samples representing 36 populations, including 17 tribal populations, from six geographic
regions and with differing social and linguistic categories. In addition they included 176 samples
representing eight Pakistani populations and 175 East Asian samples from 18 populations. Their results do not
support models of a recent genetic input from Central Asia but trace it to a deep ancestry in pre- 
Holocene and Holocene-era (not IE) expansions. Their results suggest a peninsular origin of 
Dravidian speakers. 

What is significant in all the studies that attempt to refute IE migration is evidence of a
deep common ancestry of Indian and Western Eurasian lineages. These
differential findings also find their way into analyses of the Andaman and Nicobar
islands. 28

These findings also challenge current theories of migration from Africa. Were there one 
or multiple routes out of Africa and what routes did modern humans take in their 
migrations? While this is a large field of inquiry, it is important to note that questions 
about the origin of South Asian populations do affect the picture of the larger migrations. 
Authors such as Disotell 5 and Sahoo 25 have taken these data to suggest the Southern
Route to Asia as a potential route from Africa suggesting separate origins for western 
Eurasian and southern Asian populations over 50,000 years ago. However, others such as 
Palanichamy et al find no evidence for a two route system from Africa and instead 
suggest a common migration. Recent work by anthropologists Petraglia and James 20
suggests that India was peopled 70,000 years ago and modern humans replaced all earlier
inhabitants. Others such as Roychoudhury 23 suggest that evidence from haplogroup M
indicates that it arose in India and was carried to East Africa from India. Metspalu et al
argue that India may well have provided the initial settlers who colonized much of
Eurasia.


2. Genetic Relationship of Castes and Tribes: In India alone, there are about 450 
tribal populations comprising roughly eight percent of the total population who 
are believed to speak 750 languages classified into three language families: 
Austro-Asiatic (AA), Dravidian (DR), and Tibeto-Burman (TB). 1 * 3 Studies of the 
caste system include one or all of four recognized Varnas: Brahmin, Kshatriya, 
Vysya, Sudra; yet there are believed to be over 4,600 caste groups in India, and 
members of tribal populations may also claim caste membership. 4 In addition, some
studies include religious communities such as Islam, Christianity, Sikhism, 
Jainism etc. Studies compare the genetic differences between “caste” groups and 
“tribes,” as well as geographic differences among castes and tribes, with a view 
toward understanding North/South differences. 

Some of the studies such as Basu et al 3 suggest that tribal and caste populations are
highly differentiated. Cordaux finds significant differences between caste and tribal
populations in haplogroup distributions, finding that Indian caste groups are more closely
related to each other than to tribal groups. Sahoo et al 25 by contrast found that there were
no differences between castes and tribes. Bhattacharya 4 finds significant variation
between castes and tribes but non-significant variation among castes. Basu 3 and
Bamshad 2 and Cordaux 7 find that upper castes are more closely related to central Asians
than lower castes. Others such as Kivisild et al 11 find no such divergence
between caste and tribal populations and find variation extant in caste and tribal groups.


General Concerns with the Interpretation of Genetic Evidence for Migrations 

What does the genetic evidence tell us? What should we interpret from it? Several 
factors need to be taken into consideration in interpreting this data:

1. DNA evidence is often discussed as “scientific,” “unbiased,” “irrefutable,” and
“objective.” However, implicit in almost every study are social scientific and
cultural assumptions. What is a population? What is a caste? What is a linguistic
group? Who defines them? It is impossible to do research on social populations
without making interpretive choices in defining target populations to be studied.
Such interpretations are of necessity open to question. For example, in taking the
case of the caste system, we know that it is an “elastic” system and that two castes
that share the same name may have very different origins in different
geographical regions. 13 Studies of caste in one region are not easily extrapolated
to others. What counts as an upper or middle or lower “caste” group in one area
may not translate into the same category in another.

2. Implicit in much of the analysis are assumptions about “Central Asian” genes,
“European” genes, and “Asian” genes, as though these were easily definable and well
determined. Furthermore, small sample sizes often characterize each of these gene
pools. Current studies continue to redefine each of these geographic areas. For
example, a finding of sex-specific migration patterns, as reported by one study of
Central Asian populations, 19 may affect how one interprets Y chromosome or mtDNA
data. Assumptions are made not only about South Asians but also about
how “Europeans,” “Central Asians,” and other groups are defined.

3. As stated in the summary, individual sample cells of most studies are very small. 
Many studies, because their sample sizes are so small, pool their data in reaching 
their conclusions yet make bold cases for trends within India. There is a need for 
caution in our interpretations when sampling large numbers of populations with 
individual communities or populations represented by just a few individuals. 
While we recognize the difficulty of such work, we also recognize the need to see 
hypotheses retested and for consensus on methods and results to develop within 
the field. This is only just beginning to happen. 

4. Y chromosome and mtDNA are unique in what they tell us. Their strength, in 
telling an uninterrupted story about paternal genealogy (father-father-father-father 
in the case of Y chromosomes) or maternal genealogy (mother-mother-mother in 
the case of mtDNA) is also a weakness. In each generation, we sample a smaller 
fraction of the ancestral lineage. While one samples half in Generation 1, one 
samples only 1/16,384 in Generation 14. The remaining 16,383 are absent in the 
data. Therefore while powerful in what they can tell us, they also do not tell us a 
lot about history. Complex or accidental acts of history may forever shift the 
genetic data. Therefore (as most of the studies acknowledge), the genetic evidence 
must be taken into consideration along with and not instead of other forms of 
linguistic, anthropological, and historical evidence. 
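
The arithmetic behind the generation figures cited above can be checked directly. The following is a minimal Python sketch and is not drawn from any of the studies cited; the function names are our own, and the sketch assumes that all ancestors in a given generation are distinct individuals, which ignores the overlap in ancestry found in real populations.

```python
def ancestors_in_generation(generation: int) -> int:
    """Number of genealogical ancestors `generation` steps back,
    assuming no ancestor appears twice (an idealization)."""
    if generation < 1:
        raise ValueError("generation must be 1 or greater")
    return 2 ** generation


def uniparental_fraction(generation: int) -> float:
    """Share of those ancestors lying on a single uniparental line,
    i.e. the one Y-chromosome or mtDNA ancestor in that generation."""
    return 1 / ancestors_in_generation(generation)


if __name__ == "__main__":
    for g in (1, 14):
        total = ancestors_in_generation(g)
        print(f"Generation {g}: 1 of {total} ancestors traced, "
              f"{total - 1} not represented")
```

For Generation 14 this gives 16,384 ancestors, of whom 16,383 are not represented on the single line traced, matching the figures given in the text.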


Conclusions:

As we have seen, there is considerable variation in the genetic research on South Asia, 
and disagreement on what the results of these studies mean. One set of studies finds 
evidence for Indo-European migrations while a different set finds no evidence for a 
recent migration but traces the peopling of South Asia well into the Pleistocene. In part 
this may be due to the fact that different types of genetic markers and loci can have 
different mutation rates, which means that inferences about population history may 
reflect different points of time in the past. For example, the analysis of one locus might 
indicate a migration approximately 2000 years ago, and the analysis of another locus 
might indicate no migration, but be detecting population history from 10,000 years ago. 
Thus, while one study detects a migration and the other does not, the results may not be 
technically contradictory, but complex and difficult to piece together into a coherent 
story. The final picture depends on our assumptions about the mutation rates of each 
locus. The task at hand is difficult: we are measuring the extant South Asian gene pool (the
Indian population itself is greater than 1 billion) and surmising historical migrations over 
thousands of years and in some cases tens of thousands of years. Several points thus need to be 
made with respect to the studies. 
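
The dependence of inferred dates on assumed mutation rates can be illustrated with a deliberately simplified molecular clock, in which the age of a lineage is estimated as the average number of accumulated mutations divided by an assumed mutation rate. The sketch below is in Python. The function name and all numerical values are hypothetical and chosen only for illustration, and the calculation is not the method of any particular study discussed here.

```python
def estimated_age_years(mean_mutations: float, mutations_per_year: float) -> float:
    """Simplified clock: age = accumulated mutations / assumed rate.

    `mean_mutations` is the average number of mutations separating sampled
    lineages from their inferred ancestral type; `mutations_per_year` is the
    rate assumed for the locus. Both inputs are hypothetical here.
    """
    if mutations_per_year <= 0:
        raise ValueError("mutation rate must be positive")
    return mean_mutations / mutations_per_year


if __name__ == "__main__":
    observed = 2.0  # hypothetical average number of mutations
    # Two hypothetical rate assumptions applied to the same observation.
    for label, rate in (("slower clock", 1 / 20_000), ("faster clock", 1 / 5_000)):
        print(f"{label}: about {estimated_age_years(observed, rate):,.0f} years")
```

Under these illustrative assumptions the same observed data yield an estimate four times older with the slower rate than with the faster one, which is why two loci analyzed under different rate assumptions can appear to tell different stories.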

There is no consensus yet about which of the dramatically different pictures is 
correct. No study has been replicated or reanalyzed to show it is incorrect. The 
heterogeneity of the study samples is striking. The samples range from all over South 
Asia, spanning many distinct communities, populations, castes, and tribes. It is important 
to note that given the nature of the work, the sample sizes are usually not very large. 
Thus, while a study may sample 936 individuals, it includes individuals from 32 tribes 
and 45 castes. As a result, some communities are represented by very low numbers (e.g., 
one, two, or four individuals). This problem is pervasive in most of the above studies, 
except for those which focus on a particular region, community, or set of tribes (which 
are then often criticized for their particularity). 
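
The scale of this problem can be made concrete with the figures just cited. The following Python sketch computes the average number of individuals per group and the standard error of an estimated haplogroup frequency under a simple binomial model. The function names, the binomial assumption (independent random sampling within each group), and the illustrative frequency of 0.5 are our own and are not taken from the studies.

```python
import math


def mean_individuals_per_group(total_individuals: int, n_groups: int) -> float:
    """Average sample size per community if individuals were spread evenly."""
    if n_groups < 1:
        raise ValueError("n_groups must be 1 or greater")
    return total_individuals / n_groups


def frequency_standard_error(frequency: float, sample_size: int) -> float:
    """Binomial standard error of a frequency estimated from `sample_size`
    individuals, assuming independent random sampling (an idealization)."""
    if not 0 <= frequency <= 1:
        raise ValueError("frequency must lie between 0 and 1")
    if sample_size < 1:
        raise ValueError("sample_size must be 1 or greater")
    return math.sqrt(frequency * (1 - frequency) / sample_size)


if __name__ == "__main__":
    groups = 32 + 45  # tribes plus castes in the example cited above
    print(f"Average per group: {mean_individuals_per_group(936, groups):.1f}")
    for n in (1, 2, 4, 100):
        se = frequency_standard_error(0.5, n)
        print(f"n = {n}: standard error = {se:.2f}")
```

On average this is about 12 individuals per group. With only four individuals, the standard error of a frequency near 0.5 is 0.25, so an estimate from such a sample is compatible with a very wide range of true values.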

As most authors note, the tracing of such migration patterns is not simple and
involves many assumptions. For example, the fact that the U haplogroup was considered
a western Eurasian specific haplogroup until 1999 and is now regarded as the second
most common haplogroup in India 12 is a sign of the early stages of this line of research.
One might arrive at the same patterns because of very different causal reasons. These 
include differing population sizes, founder effects, bottlenecks in populations, genetic 
drift, multiple migrations, migrations back and forth, and so on. The genetic studies
to date therefore do not conclusively prove any theory. It is likely that over time as the genetic
evidence accumulates, a more complete picture of the complex history of South Asia will 
emerge. 

In conclusion, genetic research opens a new avenue for understanding the genealogies 
and migration patterns of South Asia. The task at hand is difficult: We are measuring the 
extant Indian gene pool (the Indian population itself is greater than 1 billion) and 
surmising historical migrations over thousands of years and in some cases tens of 
thousands of years. While the data is interesting and informative, the crucial task is one
of interpretation. In the present case, this is no longer the sole province of geneticists and
population biologists, but also of political activists and individuals claiming inclusion in a
particular ethnic, racial or national group. 5, 10 Given the politicization of this science, and
given the divergent results of its application to date, we cannot easily resolve this issue at
this time. This may be possible at a future date or it may be the case that the data will
always be messy and open to multiple interpretations. If and when there is a
contradiction between archeological, historical, linguistic, and genetic evidence, we must
not assume that genetic studies trump other kinds of evidence (or vice versa). This issue
calls for rigorous and thorough interdisciplinary analysis to gain an adequate 
understanding of human migrations using all data and lines of evidence (genetic, 
linguistic, anthropological, sociological, philosophical, and historical) available to us. 